 Social Media


CausalStock: Deep End-to-end Causal Discovery for News-driven Stock Movement Prediction Yuxin Lin

Neural Information Processing Systems

Two issues in news-driven multi-stock movement prediction are not well solved by existing work. On the one hand, "relation discovery" is a pivotal step when leveraging the price information of other stocks to achieve accurate stock movement prediction. Given that stock relations are often unidirectional, such as the "supplier-consumer" relationship, causal relations are more appropriate for capturing the impact between stocks. On the other hand, news data contain substantial noise, which makes it difficult to extract effective information. With these two issues in mind, we propose a novel framework called CausalStock for news-driven multi-stock movement prediction, which discovers the temporal causal relations between stocks.


Baxter Permutation Process

Neural Information Processing Systems

In this paper, a Bayesian nonparametric (BNP) model for Baxter permutations (BPs), termed the BP process (BPP), is proposed and applied to relational data analysis. BPs are a well-studied class of permutations, and it has been demonstrated that there is a one-to-one correspondence between BPs and several interesting objects, including floorplan partitioning (FP), which constitutes a subset of rectangular partitioning (RP). Accordingly, the BPP can be used as an FP model. We combine the BPP with a multi-dimensional extension of the stick-breaking process called the block-breaking process to fill the gap between FP and RP, and obtain a stochastic process on arbitrary RPs. Compared with conventional BNP models for arbitrary RPs, the proposed model is simpler and has a high affinity with Bayesian inference.
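For readers unfamiliar with Baxter permutations, the sketch below checks the standard combinatorial definition: a permutation is Baxter if it avoids the vincular patterns 2-41-3 and 3-14-2. This definition is not stated in the abstract above, and the function name `is_baxter` is a hypothetical helper chosen for illustration; the code says nothing about how the BPP itself is constructed.

```python
from itertools import permutations


def is_baxter(perm):
    """Return True if perm avoids the vincular patterns 2-41-3 and 3-14-2.

    perm is a sequence of distinct comparable values. A violation is a
    choice of positions i < j < j+1 < k (the middle two adjacent) with
        perm[j+1] < perm[i] < perm[k] < perm[j]      (pattern 2-41-3), or
        perm[j] < perm[k] < perm[i] < perm[j+1]      (pattern 3-14-2).
    """
    n = len(perm)
    for j in range(1, n - 2):          # j and j+1 form the adjacent pair
        b, c = perm[j], perm[j + 1]
        for i in range(j):             # any position to the left of j
            a = perm[i]
            for k in range(j + 2, n):  # any position to the right of j+1
                d = perm[k]
                if c < a < d < b or b < d < a < c:
                    return False
    return True


def count_baxter(n):
    """Count Baxter permutations of 1..n by brute force (small n only)."""
    return sum(is_baxter(p) for p in permutations(range(1, n + 1)))


if __name__ == "__main__":
    # 22 of the 24 permutations of length 4 are Baxter;
    # the two exceptions are 2413 and 3142.
    print(count_baxter(4))
```

The brute-force count is only practical for small n; it is meant to make the definition concrete, not to be an efficient enumeration.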


A More Motivating Examples and Analysis

Neural Information Processing Systems

Additional explanations for Table 1. We provide two real examples of how graph properties can determine which graph patterns serve as signals for graph classification. The first example is about social networks. This may be an important signal for effective graph classification, which is consistent with commonly used social network classification methods such as motif-counting. On the other hand, if a set of random graphs does not possess high CC, the prominent graph patterns might be subtrees or long edges; training a graph classification model on such graphs is unlikely to help it capture triangles, and is thus not beneficial for the classification of social networks. Although the two datasets are from totally different domains, they are both formed by spatial structures, which is reflected in Table 1, where the two datasets have very close values for multiple properties (degree distribution, shortest path length, CC).
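To make the link between CC and triangles concrete, here is a minimal sketch that computes the average local clustering coefficient of an undirected graph. It assumes that "CC" in the excerpt denotes the clustering coefficient, which the excerpt does not spell out; the adjacency-dictionary representation and the name `average_clustering` are illustrative choices, not taken from the paper.

```python
from itertools import combinations


def average_clustering(adj):
    """Average local clustering coefficient of an undirected simple graph.

    adj maps each node to the set of its neighbours. For a node with
    k >= 2 neighbours, the local coefficient is the fraction of neighbour
    pairs that are themselves connected (each such pair closes a triangle).
    Nodes with fewer than two neighbours contribute 0.
    """
    if not adj:
        return 0.0
    total = 0.0
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            continue
        closed = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        total += 2.0 * closed / (k * (k - 1))
    return total / len(adj)


if __name__ == "__main__":
    triangle = {0: {1, 2}, 1: {0, 2}, 2: {0, 1}}
    path = {0: {1}, 1: {0, 2}, 2: {1}}
    print(average_clustering(triangle))  # every neighbour pair is linked
    print(average_clustering(path))      # no triangles at all
```

A triangle-rich graph scores close to 1, while a tree-like graph scores 0, which is the contrast the paragraph draws between social networks and low-CC random graphs.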


Zero-Resource Knowledge-Grounded Dialogue Generation Wei Wu Peking University Microsoft STCA Meituan

Neural Information Processing Systems

While neural conversation models have shown great potential for generating informative and engaging responses by introducing external knowledge, learning such a model often requires knowledge-grounded dialogues that are difficult to obtain. To overcome the data challenge and reduce the cost of building a knowledge-grounded dialogue system, we explore the problem under a zero-resource setting by assuming no context-knowledge-response triples are needed for training. To this end, we propose representing the knowledge that bridges a context and a response, and the way that the knowledge is expressed, as latent variables, and devise a variational approach that can effectively estimate a generation model from a dialogue corpus and a knowledge corpus that are independent of each other. Evaluation results on three benchmarks of knowledge-grounded dialogue generation indicate that our model achieves performance comparable to state-of-the-art methods that rely on knowledge-grounded dialogues for training, and exhibits good generalization ability over different topics and different datasets.


609c5e5089a9aa967232aba2a4d03114-AuthorFeedback.pdf

Neural Information Processing Systems

For all Reviewers: Thank you for the valuable comments that help us improve the work. Wizard) to train a knowledge-grounded generation model. GT-knowledge in the input K knowledge sentences on WoW seen and WoW unseen are 37.7% and 37.4% respectively. Finally, to speed up training, we use the number 10. CMU_DoG (pseudo supervision created by selecting GT-knowledge using Sim(.,.) with the response), and the results For REALM, the notification date of ICML 2020 is quite close to the submission date of NeurIPS 2020. For Reviewer #2: We will follow your suggestions on the improvement of clarity in the final version.


Let AI fix your stock portfolio (and your anxiety)

Mashable

TL;DR: Sterling Stock Picker gives you AI-powered investment advice, a portfolio builder, and plain-English explanations for a lifetime -- now just 55.19 with code SAVE20. I was watching the first of the month's market crashes and wondered, "Does it make sense to invest right now?" I'd always been curious about the stock market, at least in terms of how people actually got rich by practically gambling, but I didn't have a single clue where to begin -- let alone how to even buy a stock. Once I saw a TikTok calling this period a once-in-a-lifetime opportunity (take that with a grain of salt), I decided to take my chance. But I needed help researching everything, like what stocks to choose and how to track them. That's when I found Sterling Stock Picker, and made my first investment.


The Scandinavian Embedding Benchmarks: Evaluating Multilingual and Monolingual Text Embedding for Scandinavian languages Kenneth Enevoldsen

Neural Information Processing Systems

The evaluation of English text embeddings has transitioned from evaluating on a handful of datasets to broad coverage across many tasks through benchmarks such as MTEB. However, this is not the case for multilingual text embeddings due to a lack of available benchmarks. To address this problem, we introduce the Scandinavian Embedding Benchmark (SEB). SEB is a framework that enables text embedding evaluation for Scandinavian languages across 24 tasks, 10 subtasks, and 4 task categories. Building on SEB, we evaluate more than 26 models, uncovering significant performance disparities between public and commercial solutions not previously captured by MTEB.


Learning Action and Reasoning-Centric Image Editing from Videos and Simulations Luis Lara

Neural Information Processing Systems

An image editing model should be able to perform diverse edits, ranging from object replacement, changing attributes or style, to performing actions or movement, which require many forms of reasoning. Current general instruction-guided editing models have significant shortcomings with action and reasoning-centric edits. Object, attribute or stylistic changes can be learned from visually static datasets. On the other hand, high-quality data for action and reasoning-centric edits is scarce and has to come from entirely different sources that cover e.g.


STimage-1K4M: A histopathology image-gene expression dataset for spatial transcriptomics

Neural Information Processing Systems

Recent advances in multi-modal algorithms have driven and been driven by the increasing availability of large image-text datasets, leading to significant strides in various fields, including computational pathology. However, in most existing medical image-text datasets, the text typically provides high-level summaries that may not sufficiently describe sub-tile regions within a large pathology image. For example, an image might cover an extensive tissue area containing cancerous and healthy regions, but the accompanying text might only specify that this image is a cancer slide, lacking the nuanced details needed for in-depth analysis. In this study, we introduce STimage-1K4M, a novel dataset designed to bridge this gap by providing genomic features for sub-tile images. STimage-1K4M contains 1,149 images derived from spatial transcriptomics data, which captures gene expression information at the level of individual spatial spots within a pathology image. Specifically, each image in the dataset is broken down into smaller sub-image tiles, with each tile paired with 15,000-30,000 dimensional gene expressions. With 4,293,195 pairs of sub-tile images and gene expressions, STimage-1K4M offers unprecedented granularity, paving the way for a wide range of advanced research in multi-modal data analysis and innovative applications in computational pathology and beyond.


Staggered Rollout Designs Enable Causal Inference Under Interference Without Network Knowledge

Neural Information Processing Systems

Randomized experiments are widely used to estimate causal effects across many domains. However, classical causal inference approaches rely on independence assumptions that are violated by network interference, when the treatment of one individual influences the outcomes of others. All existing approaches require at least approximate knowledge of the network, which may be unavailable or costly to collect. We consider the task of estimating the total treatment effect (TTE), the average difference between the outcomes when the whole population is treated versus when the whole population is untreated. By leveraging a staggered rollout design, in which treatment is incrementally given to random subsets of individuals, we derive unbiased estimators for TTE that do not rely on any prior structural knowledge of the network, as long as the network interference effects are constrained to low-degree interactions among neighbors of an individual. We derive bounds on the variance of the estimators, and we show in experiments that our estimator performs well against baselines on simulated data. Central to our theoretical contribution is a connection between staggered rollout observations and polynomial extrapolation.
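The connection to polynomial extrapolation can be illustrated with a small sketch. It assumes, as a simplification of the low-degree interference condition above, that the expected population-mean outcome is a polynomial of degree at most β in the treatment probability p, and that mean outcomes are observed at β + 1 rollout stages with distinct probabilities. Lagrange interpolation then extrapolates to p = 1 and p = 0, and their difference estimates the TTE. The function names and the example numbers are hypothetical, and this is not presented as the paper's exact estimator.

```python
def lagrange_weights(points, x):
    """Weights w_t with sum_t w_t * f(points[t]) == f(x) for every
    polynomial f of degree <= len(points) - 1 (points must be distinct)."""
    weights = []
    for t, p_t in enumerate(points):
        w = 1.0
        for s, p_s in enumerate(points):
            if s != t:
                w *= (x - p_s) / (p_t - p_s)
        weights.append(w)
    return weights


def extrapolated_tte(treat_probs, mean_outcomes):
    """Estimate the total treatment effect from staggered-rollout averages.

    treat_probs[t]   : fraction of the population treated at stage t
    mean_outcomes[t] : observed population-mean outcome at stage t
    Returns the interpolating polynomial's value at p = 1 minus its
    value at p = 0.
    """
    w_all = lagrange_weights(treat_probs, 1.0)   # everyone treated
    w_none = lagrange_weights(treat_probs, 0.0)  # no one treated
    return sum((a - b) * y for a, b, y in zip(w_all, w_none, mean_outcomes))


if __name__ == "__main__":
    # Hypothetical noiseless example: mean outcome 1 + 2p + 3p^2 (degree 2),
    # observed at three rollout stages that treat 0%, 10% and 20%.
    probs = [0.0, 0.1, 0.2]
    outcomes = [1 + 2 * p + 3 * p * p for p in probs]
    print(extrapolated_tte(probs, outcomes))  # true TTE is f(1) - f(0) = 5
```

Because the weights depend only on the rollout probabilities, no knowledge of the network is needed. The extrapolation from small p to p = 1 amplifies noise, however, which is one reason variance bounds on such estimators matter.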